Abstract

In-place associative integer sorting technique was developed, improved and specialized
for distinct integers. The technique is suitable for integer sorting. Hence, given a list S
of n integers S[0 ... n-1], the technique sorts the integers in ascending or descending
order. It replaces bucket sort, distribution counting sort and the address calculation sort
family of algorithms, and requires only a constant amount of additional memory for storing
counters and indices beside the input list. The technique was inspired by one of the
ordinal theories of "serial order in behavior" and is explained by the analogy with the
three main stages in the formation and retrieval of memory in cognitive neuroscience:
(i) practicing, (ii) storing and (iii) retrieval.

In this study, the in-place associative permutation technique is introduced for the integer
key sorting problem. Given a list S of n elements S[0 ... n-1], each having an integer
key in the range [0, m-1], the technique sorts the elements according to their integer
keys in O(n) time using only O(1) additional memory if m <= n. On the other
hand, if m > n, it sorts in O(n + m) time in the worst case, O(m) time in the average case
(uniformly distributed keys) and O(n) time in the best case, using O(1) extra space.

keywords: associative sort, permutation sort, stimulation sort, linear time sorting. 



1 Introduction 

The nervous system is considered to be closely related to, and described by, the "serial
order in behavior" in cognitive neuroscience [1, 2], with three basic theories which cover
almost all abstract data types used in computer science. These are chaining theory,
positional theory and ordinal theory [3].

Chaining theory is the extension of stimulus-response (reflex chain) theory, where 
each response can become the stimulus for the next. From an information processing 
perspective, comparison based sorting algorithms that sort the lists by making a series 
of decisions relying on comparing keys can be classified under chaining theory. Each 
comparison becomes the stimulus for the next. Hence, keys themselves are associated 
with each other. Some important examples are quick sort [3], shell sort [5], merge sort [6] 
and heap sort [7]. 

Positional theory assumes order is stored by associating each element with its position 
in the sequence. The order is retrieved by using each position to cue its associated element. 
This is the method by which conventional (Von Neumann) computers store and retrieve 
order, through routines accessing separate addresses in memory. Content-based sorting 
algorithms where decisions rely on the contents of the keys can be classified under this 
theory. Each key is associated with a position depending on its content. Some important 
examples are distribution counting sort [8, 9], address calculation sort [10-15], bucket
sort [16, 17] and radix sort [18, 19].

Ordinal theory assumes order is stored along a single dimension, where that order is 
defined by relative rather than absolute values on that dimension. Order can be retrieved 
by moving along the dimension in one or the other direction. This theory need not assume
either the item-item or the position-item associations of the chaining and positional
theories, respectively [3].

One of the ordinal theories of serial order in behavior is that of Shiffrin and Cook 
[20] which suggests a model for short-term forgetting of item and order information of 
the brain. It assumes associations between elements and a "node", but only the nodes 
are associated with one another. By moving inwards from nodes representing the start 
and end of the sequence, the associations between nodes allow the order of items to be 
reconstructed [3].

As in the ordinal model of Shiffrin and Cook, the in-place associative integer sorting
technique [21-24] assumes that associations are between the integers in the list space and
the nodes in an imaginary linear subspace that spans a predefined range of integers. The 
imaginary subspace can be defined anywhere on the list space S[0 ... n — 1] provided that 
its boundaries do not cross over that of the list making the technique in-place, i.e., beside 
the input list, only a constant amount of memory locations are used for storing counters 
and indices. Hence, moving through nodes that represent the start and end of practiced 
integers as well as retaining their relative associations with each other even when their 
positions are altered by cuing allow the order of integers to be constructed in-place in 
linear time. 

Another ordinal theory is the original perturbation model of Estes [25]. Although
proposed to provide a reasonable qualitative fit of the forgetting dynamics of short-term
memory [3] in cognitive neuroscience, the idea behind the method is that the order
of the elements is inherent in the cyclic reactivation of the elements, i.e., reactivations
lead to reordering of the elements.

When the idea behind the perturbation model is combined with the original technique
of associative sorting, the in-place associative permutation technique is obtained, where
the order of the practiced interval is inherent in the cyclic reactivation of the elements of
the list. The practicing phase of associative sorting is revised, and when the elements of
the list are reactivated with a special form of cycle leader permutation, a temporal state
is obtained that can be either stored in short (or long) term memory or restored into the
sorted permutation of the practiced interval in linear time using O(1) memory.
Therefore, the in-place associative permutation sort technique consists of
three phases, namely (i) practicing, (ii) permutation and (iii) restoring.

1.1 Original Technique: In-place Associative Integer Sorting 

The main difficulty of all distributive sorting algorithms is that, when the integers are
distributed using a hash function according to their content, several integers may be
clustered around a locus, and several may be mapped to the same location. These problems
are solved by the three inherent basic steps of associative sort [21], namely (i) practicing,
(ii) storing and (iii) retrieval.

It is assumed that associations are between the integers in the list space and the nodes 
in an imaginary linear subspace (ILS) that spans a predefined range of integers. The ILS 
can be defined anywhere on the list space S[0 ... n-1] provided that its boundaries do not
cross over that of the list. The range of the integers spanned by the ILS is upper bounded
by the number of integers n but may be smaller and can be located anywhere, making the
technique in-place, i.e., beside the input list, only a constant amount of memory locations
are used for storing counters and indices. An association between an integer and the ILS
is created by a node using a monotone bijective hash function that maps the integers in
the predefined interval to the ILS. The process of creating a node by mapping a distinct
integer to the ILS is "practicing a distinct integer of an interval". Once a node is created,
the redundancy due to the association between the integer and the position of the node
releases the word allocated to the integer in the physical memory except for one bit which
tags the word as a node for interrogation purposes. The tag bit discriminates the word as a
node and the position of the node lets the integer be retrieved back from the ILS using the
inverse hash function. This is "integer retrieval". All the bits of the node except the tag
bit can be cleared and used to encode any information. Hence, they are the "record" of
the node and the information encoded into a record is the "cue" by which cognitive
neuroscientists describe the way that the brain recalls the successive items in an order during
retrieval. For instance, it will be foreknown from the tag bit that a node has already been 
created while another occurrence of that particular integer is being practiced providing 
the opportunity to count other occurrences. The process of counting other occurrences 
of a particular integer is "practicing all the integers of an interval", i.e., rehearsing used 
by cognitive neuro-scientists to describe the way that the brain manipulates the sequence 
before storing in a short (or long) term memory. Practicing does not need to alter the 
value of other occurrences. Only the first occurrence is altered while being practiced from 
where a node is created. All other occurrences of that particular integer remain in the 
list space but become meaningless. Hence they are "idle integers". On the other hand, 
practicing does not need to alter the position of idle integers as well, unless another distinct 
integer creates a node exactly at the position of an idle integer while being practiced. In 
such a case, the idle integer is moved to the former position of the integer that creates 
the new node. This makes associative sort unstable, i.e., equal integers may not retain 
their original relative order. 
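
The node layout described above can be illustrated with a few bit helpers. This is only a sketch under stated assumptions: the word length `W = 32` and the names `TAG`, `REC`, `make_node`, `is_node`, `record_of` and `retrieve_key` are chosen for this example and do not come from the original technique.

```python
W = 32                      # assumed word length w
TAG = 1 << (W - 1)          # most significant bit: marks a word as a node
REC = TAG - 1               # remaining w-1 bits: the record of the node

def make_node(record=0):
    """Return a tagged word whose record holds `record`."""
    return TAG | record

def is_node(word):
    """True if the tag bit of the word is set."""
    return bool(word & TAG)

def record_of(word):
    """Return the w-1 record bits, ignoring the tag bit."""
    return word & REC

def retrieve_key(position, delta):
    """Inverse hash: a node at ILS position `position` stands for key position + delta."""
    return position + delta

# A key 12 in a list whose minimum is delta = 10 is practiced at ILS position 2.
delta = 10
position = 12 - delta
node = make_node()                      # first occurrence: no idle keys counted yet
node = make_node(record_of(node) + 1)   # a second occurrence of 12 is counted in the record
```

Because the key can always be recomputed from the position of its node, the record bits are free to carry a counter or, later, positional information.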

Once all the integers in the predefined interval are practiced, the nodes dispersed
in the ILS are clustered in a systematic way, closing the distance between them in one
direction while retaining their relative order. This is the storing phase of associative sort where
the received, processed and combined information to construct the sorted permutation of 
the practiced interval is stored in the short-term memory. When the nodes are moved 
towards a direction, it is not possible to retain the association between the ILS and list 
space. However, the record of a node can be further used to encode the absolute (former) 
position of that node as well, or maybe the relative position or how much that node 
is moved relative to its absolute or relative position during storing. Unfortunately, this 
requires that a record is enough to store both the positional information and the number 
of idle integers practiced by that node. However, as explained earlier, further associations 
can be created using the idle integers that were already practiced by manipulating either 
their position or value or both. Hence, if the record is enough, it can store both the 
positional information and the number of idle integers. If not, an idle integer can be 
associated accompanying the node to supply additional space for it for the positional 
information. 

Finally, the sorted permutation of the practiced interval is constructed in the list 
space, using the stored information in the short-term memory. This is the retrieval phase 
of associative sort that depends on the information encoded into the record of a node. 
If the record is enough, it stores both the position of the node and the number of idle 
integers. If not, an associated idle integer accompanying the node stores the position of 
the node while the record holds the number of idle integers. The positional information 
cues the recall of the integer using the inverse hash function. This is "integer retrieval"
from the imaginary subspace. Hence, the retrieved integer can be copied onto the list space
as many times as it occurs.

Hence, moving through nodes that represent the start and end of practiced integers
as well as retaining their relative associations with each other even when their positions
are altered by cuing allow the order of integers to be constructed in linear time in-place.

From the complexity point of view, associative sort shows similar characteristics to
bucket sort [16, 17] and distribution counting sort [8, 9]. It sorts n integers S[0 ... n-1],
each in the range [0, m-1], using O(1) extra space in O(n + m) time in the worst case, O(m)
time in the average case (uniformly distributed integers) and O(n) time in the best case. The
ratio m/n defines the efficiency (time-space trade-offs) of the algorithm, letting very large
lists be sorted in-place.

1.2 In-place Associative Permutation Sort 

In this study, the associative sorting technique is revised by combining the ideas behind the
ordinal theories of Shiffrin and Cook [20] and the original perturbation model of Estes [25].
In the perturbation model of Estes, the order is inherent in the cyclic reactivation of elements.
When both theories are combined, the in-place associative permutation sort technique is
obtained, where the order of the practiced interval is inherent in the cyclic reactivation of the
elements of the list. Hence, reactivating (stimulating) the list with a special form of cycle
leader permutation results in a temporal state that can be either stored in short (or long)
term memory or restored into the sorted permutation of the practiced interval, all in linear
time using O(1) memory, with three phases: (i) practicing, (ii) permutation
(stimulation, reactivation) and (iii) restoring.

Practical comparisons for lists up to one million integer keys, all in the range [0, n-1], on
a Pentium machine with radix sort and bucket sort indicate that associative permutation
sort is roughly 2 times slower than radix sort and roughly 3 times slower than bucket sort.
On the other hand, it is roughly 1.5 times faster than quick sort for the same lists.

Although its time complexity is similar to that of in-place associative sort [21] and it is
slower in practice, in-place associative permutation sort is proposed for the integer key
sorting problem. Hence, it sorts n elements S[0 ... n-1], each having an integer key in the
range [0, m-1] with m > n, using O(1) extra space in O(n + m) time in the worst case, O(m)
time in the average case (uniformly distributed keys) and O(n) time in the best case.

2 Definitions 

Given a list S of n elements S[0], S[1], ..., S[n-1], each having an integer key, the problem
is to sort the elements of the list according to their integer keys. To avoid repeating
statements like "key of the element S[i]", S[i] is used to refer to the key. The notations
used throughout the study are:

(i) The universe of integer keys is assumed to be U = [0 ... 2^w - 1], where w is the fixed
word length.

(ii) The maximum and minimum keys of a list are max(S) = max{a | a ∈ S} and min(S) =
min{a | a ∈ S}, respectively. Hence, the range of the keys is m = max(S) - min(S) + 1.

(iii) The notation B ⊂ A is used to indicate that B is a proper subset of A.

(iv) For two lists S1 and S2, max(S1) < min(S2) implies S1 < S2.

Universe of Keys. When a key is first practiced, a node is created, releasing the w bits of
the key. One bit is used to tag the word as a node. Hence, it is reasonable to suspect
that the tag bit limits the universe of keys, because all the keys should be untagged and
in the range [0, 2^(w-1) - 1] before being practiced. But we can

(i) partition S into 2 disjoint sublists S1 < 2^(w-1) ≤ S2 in O(n) time with well-known
in-place partitioning algorithms, or stably with [27],

(ii) shift all the keys of S2 by -2^(w-1), sort S1 and S2 with the associative permutation
sort technique and shift S2 back by 2^(w-1).
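
The two steps above can be sketched as follows. This is an illustrative sketch only: the word length `W = 8` is picked to keep the numbers small, the partition is a plain unstable in-place one, and `sort_range` with its stand-in `stand_in_sort` are hypothetical placeholders for the associative permutation sort itself.

```python
W = 8                       # assumed word length, small for readability
HALF = 1 << (W - 1)         # 2^(w-1): keys at or above this would collide with the tag bit

def sort_full_universe(S, sort_range):
    """Sort keys from [0, 2^w - 1] with a sorter that only accepts keys below 2^(w-1)."""
    # (i) partition in place so that S[:lo] < HALF <= S[lo:]
    lo = 0
    for i in range(len(S)):
        if S[i] < HALF:
            S[i], S[lo] = S[lo], S[i]
            lo += 1
    # (ii) shift the upper sublist down, sort both sublists, shift back
    for i in range(lo, len(S)):
        S[i] -= HALF
    sort_range(S, 0, lo)
    sort_range(S, lo, len(S))
    for i in range(lo, len(S)):
        S[i] += HALF
    return S

def stand_in_sort(S, a, b):
    """Hypothetical placeholder for the tag-bit-based sorter; not in-place."""
    S[a:b] = sorted(S[a:b])

keys = [200, 3, 130, 127, 128, 0]
sort_full_universe(keys, stand_in_sort)
```

After the shift, every key handed to the sorter is below 2^(w-1), so its most significant bit is free to serve as the tag.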

There are other methods to overcome this problem. For instance, 

(i) sort the sublist S[0 ... (n / log n) - 1] using the optimal in-place merge sort [28],

(ii) compress S[0 ... (n / log n) - 1] by Lemma 1 of [29], generating Ω(n) free bits,

(iii) sort S[(n / log n) ... n-1] with the in-place associative permutation sort technique
using the Ω(n) free bits as tag bits,

(iv) uncompress S[0 ... (n / log n) - 1] and merge the two sorted sublists in-place in linear
time by [22].

Number of Keys. If practicing a distinct key lets us use w - 1 bits to practice other
occurrences of that particular key, we have w - 1 free bits by which we can count up to
2^(w-1) occurrences including the first key that created the node. Hence, it is reasonable to
suspect again that there is another restriction on the size of the lists, i.e., n < 2^(w-1). But
a list can be divided into two parts in O(1) time and those parts can be merged in-place
in linear time by [28] after being sorted associatively.

Hence, for the sake of simplicity, it will be assumed that n < 2^(w-1) and all the keys
are in the range [0, 2^(w-1) - 1] throughout the study.

3 Associative Permutation Sort 

Given n distinct integer keys S[0 ... n-1], each in the range [u, v], if m = n, the sorted
permutation of the list can be represented with two parameters (2 log U bits), one of which
is the initial address of the sequential memory allocated for the list (accessed by S[0])
in the RAM and the other is δ = u. The i-th key of the sorted list can be calculated
by S[i] = i + δ, and the actual value at the i-th location is meaningless for this calculation.
Hence, if S is the sorted permutation, then there is a bijective relation between each key
and its position, i.e., i = S[i] - δ. Conversely, if S is not the sorted permutation,
i ≠ S[i] - δ implies that the key S[i] is not at its exact location. Its exact location can be
calculated with j = S[i] - δ. Therefore, this monotone bijective hash function that maps
the keys to j ∈ [0, n-1] can sort the list in O(n) time using O(1) constant space. This is
cycle leader permutation, where S is rearranged by following the cycles of a permutation
π. First S[0] is sent to its final position π(0) (calculated by j = S[0] - δ). Then the
element that was in π(0) is sent to its final position π(π(0)). The process proceeds in this
way until the cycle is closed, that is, until the key that belongs to position 0 is found, which
means that the association 0 = S[0] - δ is constructed between the first key and its position.
Then the iterator is increased to continue with the key of S[1]. At the end, when all the cycles
of S[i] for i = 0, 1, ..., n-1 are processed, all the keys are moved to their exact positions
and the association i = S[i] - δ is constructed between the keys and their positions, i.e.,
the sorted permutation of the list is obtained.
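
For the distinct-key case with m = n, the cycle leader permutation just described can be written in a few lines. This is a minimal sketch; the function name `cycle_leader_sort` is an assumption of the example, and the code relies on the stated precondition that the keys are distinct and fill the range [δ, δ + n - 1].

```python
def cycle_leader_sort(S):
    """Sort n distinct keys that fill [delta, delta + n - 1], in place."""
    if not S:
        return S
    delta = min(S)
    for i in range(len(S)):
        # Follow the cycle through position i until the key that belongs here arrives.
        while S[i] - delta != i:
            j = S[i] - delta            # exact position of the key currently at i
            S[i], S[j] = S[j], S[i]     # send it there; the displaced key comes to i
    return S

example = [7, 5, 9, 6, 8]
cycle_leader_sort(example)              # example is now [5, 6, 7, 8, 9]
```

Every exchange puts at least one key into its final position, so the total number of exchanges is at most n even though the loops are nested.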

Associative permutation sort is based on this very simple idea: if the keys of a pre- 
defined interval are all distinct, then the list can be reactivated (stimulated) starting a 
special form of cycle leader permutation which rearranges the practiced interval into a 
state that can be simply restored into sorted permutation. 

An ILS is defined as Im[0 ... n-1] over S[0 ... n-1] if the range of keys that
it spans is exactly equal to n. Hence, an ILS Im[0 ... n-1] can span the keys that are in
[δ, δ + n - 1] where δ = min(S). With this information, we can state the following.

Lemma 3.1. Given n integer keys S[0 ... n-1], the keys that are in the range [δ, δ + n - 1],
where δ = min(S), can be sorted at the beginning of the list in O(n) time using O(1)
constant space.

Proof. Given n distinct integer keys S[0 ... n-1], each in the range [u, v], it is not possible to
construct a monotone bijective hash function (minimal monotone perfect hash function)
that maps all the keys of the list into j ∈ [0, n-1] without additional storage space [30].
However, a monotone bijective hash function can be constructed as a partial function [31]
that assigns each key of S1 ⊂ S in the range [δ, δ + n - 1], with δ = min(S), to exactly one
element j ∈ [0, n-1], ignoring the keys of S2 = S - S1. The partial monotone bijective
hash function of this form is

\[ j = S[i] - \delta \quad \text{if } S[i] - \delta < n \tag{3.1} \]

With this definition, the proof has three basic steps of associative permutation sort:

(i) Practice all the keys in the range [δ, δ + n - 1]. If n_d nodes are created that practiced
n_c idle keys, then all the keys in the practiced interval become distinct and
consecutive in the range [0 ... n_d + n_c - 1] after the practicing phase, ignoring the tag
bits.

(ii) Permute (stimulate, reactivate) the list to rearrange the practiced interval at the 
beginning of the list. 

(iii) Restore the sorted permutation of the practiced interval. This phase is only to bring 
the keys to their original values. The elements of the list (satellite data) are already 
sorted before this phase. 

□

3.1 Practicing Phase

Algorithm A. Practice all the keys in the range [δ, δ + n - 1] by mapping them into the
ILS Im[0 ... n-1] over S[0 ... n-1] using Eqn. 3.1. It is assumed that the minimum of the
list, δ = min(S), is known.

A1. initialize i = 0;

A2. if MSB of S[i] is 1, then S[i] is a node. Hence, increase i and repeat this step;

A3. if S[i] - δ ≥ n, then S[i] is a key of S2 which is out of the interval that is practiced.
Hence, increase n'_d that counts the number of keys of S2, update δ' = min(δ', S[i]),
increase i and goto step A2;

A4. otherwise, S[i] is a key of S1 that is to be practiced. Hence, calculate j = S[i] - δ
(Eqn. 3.1);

A5. if MSB of S[j] is 0, then S[i] is the first key that will create the node at j. Hence,
move S[j] to S[i], clear S[j] and set its MSB to 1. If j < i, increase i. Increase n_d
that counts the number of distinct keys and hence the nodes, and goto step A2;

A6. otherwise, S[j] is a node that has already been created by another occurrence of S[i].
Hence, clear MSB of S[j], increase S[j] and set its MSB back to 1. Increase i and
n_c that counts the total number of idle keys over all distinct keys and goto step A2.

At the end of practicing, each record of a node keeps the number of idle keys practiced
by that node. Hence, the record of a node at i stores one less than the number of occurrences
of the key equal to i + δ (the node itself represents the first occurrence that creates the
node). In total, n_d nodes are created that practice n_c idle keys.
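
Steps A1 to A6 can be sketched as a single scan. This is a hedged illustration, not a reference implementation: the word length `W = 32`, the constants `TAG` and `REC`, the function name `practice` and the returned tuple are assumptions of the example, and the scan is taken to stop when i reaches n.

```python
W = 32                      # assumed word length w
TAG = 1 << (W - 1)          # MSB: tags a word as a node
REC = TAG - 1               # record: the low w-1 bits

def practice(S, delta):
    """Algorithm A on the whole list; returns (n_d, n_c, n'_d, delta')."""
    n = len(S)
    nd = nc = n_out = 0
    next_delta = None                    # delta': smallest key outside the interval
    i = 0
    while i < n:
        x = S[i]
        if x & TAG:                      # A2: a node, skip it
            i += 1
        elif x - delta >= n:             # A3: key of S2, outside the practiced interval
            n_out += 1
            next_delta = x if next_delta is None else min(next_delta, x)
            i += 1
        else:
            j = x - delta                # A4: Eqn. 3.1
            if S[j] & TAG:               # A6: node exists, count one more idle key
                S[j] += 1                # the tag bit is untouched while the record < 2^(w-1)
                nc += 1
                i += 1
            else:                        # A5: first occurrence creates the node at j
                S[i] = S[j]              # the word that lived at j moves to i
                S[j] = TAG               # node with an empty record
                nd += 1
                if j < i:                # the moved word was already visited
                    i += 1
    return nd, nc, n_out, next_delta

S = [3, 1, 3, 0, 9, 1]
counts = practice(S, 0)
```

With δ = 0 and n = 6, the key 9 falls outside the practiced interval [0, 5]; the keys 0, 1 and 3 each create one node, and the second occurrences of 1 and 3 remain in the list as idle keys.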
The next step of the practicing phase is the accumulation.

Algorithm B. Sum all the records of the nodes from left to right.

B1. initialize i = 0 and j = 0;

B2. if MSB of S[i] is 1, then S[i] is a node. Hence, clear MSB of S[i], set S[i] = S[i] + j
and j = S[i] + 1, set MSB of S[i] back to 1, increase i and repeat this step;

B3. otherwise, increase i and goto step B2.

At the end, each record of a particular node keeps the exact position of the last idle key 
practiced by that node. 
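
The accumulation can be sketched as a prefix sum over the node records. In this sketch, which uses assumed names (`W`, `TAG`, `REC`, `accumulate`, `run`), the running offset is advanced to one past the last slot of each node so that the record ends up holding the position of that node's last idle key, as the postcondition above requires.

```python
W = 32                      # assumed word length w
TAG = 1 << (W - 1)          # MSB: tags a word as a node
REC = TAG - 1               # record: the low w-1 bits

def accumulate(S):
    """Algorithm B: turn each node's idle-key count into the position of its last idle key."""
    run = 0                              # first free position in the sorted prefix
    for i in range(len(S)):
        if S[i] & TAG:
            last = (S[i] & REC) + run    # node occupies `run`, idle keys follow up to `last`
            S[i] = TAG | last
            run = last + 1               # the next node starts right after
    return S

# State after practicing [3, 1, 3, 0, 9, 1] with delta = 0:
# nodes at 0, 1 and 3 with records 0, 1 and 1.
S = [TAG | 0, TAG | 1, 3, TAG | 1, 9, 1]
accumulate(S)
```

In the sorted order 0, 1, 1, 3, 3 the last occurrences of the keys 0, 1 and 3 sit at positions 0, 2 and 4, which are exactly the records produced here.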

An ILS can create other subspaces and associations using the idle keys that were 
already practiced by manipulating either their position or value or both. Hence, it is 
logical to use the nodes of ILS as discrete hash functions that define the values of idle keys
when they are re-practiced using the same monotone bijective hash function (Eqn. 3.1).
This is the re-practicing step that makes all the idle keys and the nodes distinct (ignoring
the tag bits).

Algorithm C. Re-practice all the idle keys by mapping them again to the ILS Im[0 ... n-1]
over S[0 ... n-1] using the same hash function (Eqn. 3.1). When an idle key is remapped
to its node, it will obtain its exact position (ticket) from its node, which will make it
distinct and in the range [0, n-1]. The record of the node will be decreased by one for
each re-practiced idle key. Hence, when all the idle keys of a node are re-practiced, the
node will become distinct as well, in the range [0, n-1], storing its exact position when
its tag bit (MSB) is ignored. It should be noted that, if Algorithm A were stable, the list
should be processed from right to left for stability as below.

C1. initialize i = n - 1;

C2. if MSB of S[i] is 1, then S[i] is a node. Hence, decrease i and repeat this step;

C3. if S[i] - δ ≥ n, then S[i] is a key of S2 that is out of the practiced interval. Hence,
decrease i and goto step C2;

C4. otherwise, S[i] is an idle key. Hence, calculate j = S[i] - δ (Eqn. 3.1), clear MSB of
S[j], copy S[j] over S[i], decrease S[j] by one and set its MSB back to 1, decrease
i and goto step C2.

At the end, when the tag bits are ignored, all the keys and nodes of the practiced interval
have become distinct, in the range [0, n-1], and their distinct values are equal to their exact
positions in the sorted permutation.
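
Steps C1 to C4 amount to a right-to-left scan in which every idle key takes a ticket from its node. The following sketch uses assumed names (`W`, `TAG`, `REC`, `repractice`) and starts from a hand-built state in which the records already hold the positions of the last idle keys.

```python
W = 32                      # assumed word length w
TAG = 1 << (W - 1)          # MSB: tags a word as a node
REC = TAG - 1               # record: the low w-1 bits

def repractice(S, delta):
    """Algorithm C: replace each idle key by its exact position in the sorted prefix."""
    n = len(S)
    for i in range(n - 1, -1, -1):
        x = S[i]
        if x & TAG or x - delta >= n:    # C2, C3: node or key outside the interval
            continue
        j = x - delta                    # C4: the node of this idle key (Eqn. 3.1)
        S[i] = S[j] & REC                # take the ticket: current record of the node
        S[j] -= 1                        # next idle key of this node gets the slot before
    return S

# Records hold the last idle positions 0, 2 and 4 for the keys 0, 1 and 3 (delta = 0).
S = [TAG | 0, TAG | 2, 3, TAG | 4, 9, 1]
repractice(S, 0)
```

Ignoring the tag bits, the practiced words now read 0, 1, 4, 3 and 2: a permutation of the positions 0 to 4, while the key 9 is untouched.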

3.2 Permutation Phase 

In the permutation phase, all the elements of the list are reactivated (stimulated) by starting
a special form of cycle leader permutation that leads to reordering of the practiced interval
in short-term memory (S[0 ... n_d + n_c - 1]), which can be simply restored into the sorted
permutation of the original keys. It is not possible to use a simple cycle leader permutation
because there are nodes, and if their positions change, the association between the ILS and
the list space is broken. However, a node has a record of w - 1 bits which at the moment
keeps the node's exact position in the sorted permutation. Hence, when a node is moved
to its exact position, its former position can be written into its record as the cue
that can be used to recall the key using the inverse hash function, i.e., retrieve it from the
ILS. But how can an algorithm distinguish the nodes that are already moved from the
nodes that are not moved yet in such a case? The idle keys and the keys that are out
of the practiced interval are the solution to this problem. If a node is at a wrong position,
then it is evident that either an idle key or a key that is out of the practiced interval is
available which will address that position. Hence, a special outer cycle leader permutation
that reactivates only the idle keys and the keys that are out of the practiced interval will
ensure that the corresponding one will be moved to the actual position of the node, giving
a chance to start an inner cycle leader permutation that reactivates only the nodes and
ensures that the node will be moved to its exact position, keeping its former position in
its record as the cue. Once a node is moved to its exact position, there cannot be any
other outer cycle leader which will address that particular position. If another node is
present where that particular node is moved, then the inner cycle leader permutation
can continue with that node. However, if an idle key or a key that is out of the practiced
interval is encountered, then the inner cycle leader permutation ends and the outer cycle
leader permutation continues.

Algorithm D. Permute (stimulate, reactivate) the list from left to right to move all the
idle keys and the nodes to their exact positions in the short-term memory (S[0 ... n_d +
n_c - 1]), from where one can restore the original keys. For this purpose, start an outer
cycle leader permutation that reactivates only the idle keys and the keys that are out of
the practiced interval. When a key is found that is out of the practiced interval, move it
to S[k], starting with k = n_d + n_c, increase k every time a key out of the practiced
interval is moved, and continue with the key that was at k before. If an idle key is found,
implicitly practice it, i.e., move it to S[j] where j = S[i]. If an idle key or a key that is
out of the practiced interval is moved to a position occupied by a node, start an inner
cycle leader permutation that reactivates only the nodes until a new idle key or a key
that is out of the practiced interval is encountered. Repeat this until all the idle keys of
the list are reactivated and all the keys that are out of the practiced interval are moved
to S[n_d + n_c ... n-1]. At the end, a list is obtained where all the distinct idle keys are in
their exact positions in the sorted permutation, i.e., they are implicitly practiced as in [21],
and an association is created between each idle key and its position with the monotone
bijective hash function i = S[i]. Likewise, the nodes are in their exact positions and each
node precedes all the idle keys that it has practiced, from where one can obtain the exact
values of those idle keys by retrieving the key back from the ILS through that node using its
record as the cue.

D1. initialize i = 0, k = n_d + n_c;

D2. if MSB of S[i] is 1, then S[i] is a node. Hence, increase i and repeat this step;

D3. if i = S[i], then S[i] is an idle key that has already been practiced implicitly. Hence,
increase i and goto step D2;

D4. S[i] is either an idle key not practiced yet or a key that is out of the practiced
interval. Hence,

(i) if S[i] is an idle key, i.e., in the range [0, n-1], then swap S[i] with S[j] where
j = S[i];

(ii) otherwise, S[i] is a key that is out of the practiced interval. Hence, swap S[i]
with S[k], set j = k and increase k;

D5. if MSB of S[i] is 0, then S[i] is either an idle key or a key that is out of the practiced
interval again. Hence, goto step D3;

D6. otherwise, S[i] is a node. Hence, start an inner cycle leader permutation. Clear
MSB of S[i], swap S[i] with S[p] where p = S[i], encode the former position j (which comes
from step D4) of the node into its record (least significant w - 1 bits of S[p]) and
set MSB of S[p] back to 1;

D7. if MSB of S[i] is 1, then it is a node again. Hence, continue the inner cycle leader
permutation, i.e., set j = p and goto step D6;

D8. otherwise, S[i] is either an idle key or a key that is out of the practiced interval.
Hence, finish the inner cycle leader permutation and goto step D3.

At the end, a list is obtained where,

(i) all the distinct idle keys are in their exact positions in the sorted permutation, i.e.,
they are implicitly practiced as in [21] and an association is created between each
idle key and its position with the monotone bijective hash function i = S[i],

(ii) the nodes are in their exact positions and each node precedes its idle keys, from
where one can obtain the exact values of those idle keys by retrieving the key back
from the ILS through that node using its record as the cue,

(iii) n'_d unpracticed keys are located disorderly at S[n_d + n_c ... n-1].

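
The outer and inner cycle leader permutations of steps D1 to D8 can be sketched as two nested loops. The names `W`, `TAG`, `REC` and `permute` are assumptions of this example, and the outer scan is taken to stop at i = n_d + n_c. The sketch also adds one guard that the step list does not spell out: when a node arrives at i and its exact position p equals i, the inner cycle stops at once, because the word at i is then the node that has just been placed.

```python
W = 32                      # assumed word length w
TAG = 1 << (W - 1)          # MSB: tags a word as a node
REC = TAG - 1               # record: the low w-1 bits

def permute(S, nd, nc):
    """Algorithm D: move idle keys and nodes to their exact positions in S[0 .. nd+nc-1]."""
    n = len(S)
    m = nd + nc
    k = m                                # next slot for a key outside the interval
    i = 0
    while i < m:
        x = S[i]
        if x & TAG or x == i:            # D2, D3: node, or idle key already in place
            i += 1
            continue
        if x < n:                        # D4(i): idle key, its value is its position
            j = x
        else:                            # D4(ii): key outside the practiced interval
            j = k
            k += 1
        S[i], S[j] = S[j], S[i]
        while S[i] & TAG:                # D5 to D8: inner cycle over nodes
            p = S[i] & REC               # exact position of the node now at i
            S[i] = S[p]                  # the word at p comes to i
            S[p] = TAG | j               # node placed; record keeps its former position
            if p == i:                   # added guard: the node was placed at i itself
                break
            j = p                        # a node displaced from p used to live at p
    return S

# State after re-practicing [5, 0, 5, 2, 2, 5, 0, 7] with delta = 0 (n_d = 4, n_c = 4).
S = [TAG | 0, 5, TAG | 2, 6, 3, TAG | 4, 1, TAG | 7]
permute(S, 4, 4)
```

In this run the node of key 5 is displaced from position 5 and lands at its exact position 4 with the record 5, so the key can later be recovered as 5 + δ.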
3.3 Restoring Phase 

With a final scan of the short-term memory (S[0 ... n_d + n_c - 1]), one can obtain the exact
values of the practiced keys at S[0 ... n_d + n_c - 1]. Hence, this phase is only to bring the
keys to their original values. The elements of the list (satellite data) are already sorted
before this phase.

Algorithm E. Restore the sorted permutation of the practiced interval.

E1. initialize i = 0;

E2. if MSB of S[i] is 1, then S[i] is a node. Hence, decode the absolute position j of
the node from its record (w - 1 bits of S[i]), cue the recall of the key using the
inverse hash function k = j + δ and retrieve the key from the ILS by S[i] = k;

E3. increase i. If S[i] is in the range [0, n-1], then it is an idle key of the preceding
node. Hence, copy the exact value of the key by S[i] = k and repeat this step;

E4. if MSB of S[i] is 1, then it is a new node. Hence, goto step E2;

E5. otherwise, S[i] is a key that is out of the practiced interval. Hence, exit.
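
The restoring scan can be sketched as follows. The names `W`, `TAG`, `REC` and `restore` are assumptions of the example, and instead of testing for a key outside the interval (step E5) the sketch simply stops after the first m = n_d + n_c words.

```python
W = 32                      # assumed word length w
TAG = 1 << (W - 1)          # MSB: tags a word as a node
REC = TAG - 1               # record: the low w-1 bits

def restore(S, delta, m):
    """Algorithm E: rewrite the first m words with their original key values."""
    key = delta                          # position 0 always holds the node of the minimum key
    for i in range(m):
        if S[i] & TAG:                   # E2: node, recall its key from the former position
            key = (S[i] & REC) + delta
        S[i] = key                       # E3: idle keys copy the key of the preceding node
    return S

# State after permuting the practiced form of [5, 0, 5, 2, 2, 5, 0, 7] (delta = 0).
S = [TAG | 0, 1, TAG | 2, 3, TAG | 5, 5, 6, TAG | 7]
restore(S, 0, 8)
```

Each node carries its former ILS position in its record, so adding δ gives back the key; the idle keys that follow it are overwritten with the same value.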

3.4 Binding the Loop 

After restoring, $n_d + n_c$ practiced keys are sorted at the beginning of the list while the
$n'_d$ unpracticed keys of $S_2$ are distributed in no particular order at $S[n_d + n_c \ldots n-1]$.
Hence, the structure of the sequential version becomes,

Algorithm D. In each iteration, construct the sorted permutation of $n_d + n_c$ keys of $S_1$ at
the beginning of the list.

D1. find $\min(S')$ and $\max(S')$;

D2. initialize $\delta = \min(S')$, $\delta' = \max(S')$ and reset counters;

D3. practice all the keys in the interval $[\delta, \delta + n - 1]$ using Algorithm A to Algorithm C;

D4. permute the list using Algorithm D;

D5. restore the sorted permutation of the practiced interval using Algorithm E;

D6. if $n'_d = 0$, exit. Otherwise set $S = S[n_d + n_c \ldots n-1]$, $n = n'_d$, $\delta = \delta'$, $\delta' = \max(S)$,
reset counters and goto step D3.
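
The iteration structure above can be illustrated with a short Python sketch. It is not the in-place technique itself: the practicing, permutation and restoring phases are replaced by an ordinary counting array (which uses extra space), so that only the pass-by-pass behaviour is shown. The function name `interval_passes` is hypothetical, and taking the next $\delta$ as the minimum of the remaining keys is an assumption made for the example.

```python
def interval_passes(keys):
    """Sort by repeatedly handling the keys in [delta, delta + n - 1].

    Stand-in for the outer loop only: a counting array replaces the in-place
    phases, so this sketch does NOT run in O(1) extra space.
    Returns (sorted_keys, number_of_passes).
    """
    done = []                 # sorted prefix built so far
    rest = list(keys)         # keys not yet practiced
    passes = 0
    while rest:
        n = len(rest)
        delta = min(rest)     # assumption: next interval starts at the minimum
        counts = [0] * n
        unpracticed = []
        for k in rest:
            if k - delta < n:             # key falls in the practiced interval
                counts[k - delta] += 1
            else:
                unpracticed.append(k)     # left for a later pass
        for offset, c in enumerate(counts):
            done.extend([delta + offset] * c)
        rest = unpracticed
        passes += 1
    return done, passes


print(interval_passes([5, 3, 9, 3, 100, 7]))   # ([3, 3, 5, 7, 9, 100], 3)
```

In the example the first pass handles the interval $[3, 8]$, the second $[9, 10]$, and the third the single remaining key, which shows how widely spread keys increase the number of passes.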

Remark 3.1. Associative permutation sort technique is on-line in the sense that after each
step D5, $n_d + n_c$ keys are added to the sorted permutation at the beginning of the list
and are ready to be used.

Practical comparisons with radix sort and bucket sort on a Pentium machine, for lists of up to
one million integer keys all in the range $[0, n-1]$, indicate that associative permutation
sort is roughly 2 times slower than radix sort and roughly 3 times slower than bucket
sort. On the other hand, it is roughly 1.5 times faster than quick sort for the same lists.

Complexity of the algorithm depends on the range and the number of the keys. In each
iteration (or recursion) the algorithm is capable of sorting the keys that satisfy $S[i] - \delta < n$.
Hence, given $n$ integer keys $S[0 \ldots n-1]$ each in the range $[0, n-1]$, the complexity is
$T(n) = O(n)$.

Best Case Complexity. Given $n$ integer keys $S[0 \ldots n-1]$ each in the range $[0, U]$, if
$n - 1$ keys satisfy $S[i] - \delta < n$, then these keys are sorted in $O(n)$ time. In the next
step, there is $n' = 1$ key left, which implies that the sorting is finished. As a result, the time
complexity of the algorithm is lower bounded by $\Omega(n)$ in the best case.

Worst Case Complexity. Given $n$ integer keys $S[0 \ldots n-1]$ each in the range $[u, v]$ with
$m = v - u + 1 = \beta n$, suppose there is only 1 key available in the practiced interval at each
iteration. Then in any $j$th step, the only key $s$ that will be sorted satisfies $s < jn - (j - 1)$,
which implies that the last such key satisfies $s < jn - (j - 1) < \beta n$, from where we obtain
$j < (\beta n - 1)/(n - 1) \approx \beta$. In this case, the time complexity of the algorithm is,

$$O(n) + O(n-1) + \ldots + O(n-j) = (j+1)O(n) - O(j^2) < (\beta + 1)O(n) \qquad (3.2)$$

Therefore, the time complexity of the algorithm in the worst case is $(\beta + 1)O(n) = O(m + n)$.

Average Case Complexity. Given $n$ integers $S[0 \ldots n-1]$, if $m = \beta n$ and the integers
are uniformly distributed, this means that $n/\beta$ integers satisfy $S[i] < n$. Therefore, the
algorithm is capable of sorting $n/\beta$ integers in $O(n)$ time during the first pass. This will continue
until all the integers are sorted. The number of integers sorted in each iteration can be
represented with the series,

$$\frac{n}{\beta} + \frac{n(\beta-1)}{\beta^2} + \ldots + \frac{n(\beta-1)^{k-1}}{\beta^k} \qquad (3.3)$$

It is reasonable to think that the sorting ends when one term is left, which means the sum
of $k$ terms of this series is equal to $n - 1$, from where we can calculate the number of
iterations or depth of recursion $k$, which is valid when $\beta > 1$, by,

$$\frac{1}{n} = \frac{(\beta-1)^k}{\beta^k} \qquad (3.4)$$

It is seen from Eqn. (3.4) that when $m = 2n$, i.e., $\beta = 2$, the number of iterations or depth of
recursion becomes $k = \log n$ and the complexity is the recurrence $T(n) = T(n/2) + O(n)$,
yielding $T(n) = O(n)$. It is known that each step takes time linear in the number of keys
remaining. Therefore, the time complexity of the algorithm is,

$$O(n) + O\!\left(\frac{n(\beta-1)}{\beta}\right) + \ldots + O\!\left(\frac{n(\beta-1)^{k-1}}{\beta^{k-1}}\right) \qquad (3.5)$$



from where we can obtain, by defining $x = \frac{\beta-1}{\beta}$,

$$O(n)(1 + x + x^2 + x^3 + \ldots + x^{k-1}) = O(n)\,\frac{1 - x^k}{1 - x} < \frac{O(n)}{1 - x} = \beta\,O(n) \qquad (3.6)$$

which means that the algorithm is upper bounded by $\beta\,O(n)$ or $O(m)$ in the average case.
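
As a numeric check of the average-case argument, the short Python sketch below solves $(1/n) = x^k$ for $k$ with $x = (\beta - 1)/\beta$ and evaluates the geometric sum $n(1 - x^k)/(1 - x)$ against the bound $\beta n$. The function name `average_case_estimate` is hypothetical, and the sketch only evaluates the formulas above under the uniform-distribution assumption; it does not measure a real implementation.

```python
import math


def average_case_estimate(n, beta):
    """Return (k, work) for uniformly distributed keys with m = beta * n.

    k    : number of passes solving (1/n) = x**k, where x = (beta - 1) / beta
    work : n * (1 + x + ... + x**(k-1)) = n * (1 - x**k) / (1 - x)
    Requires beta > 1 and n > 1.
    """
    x = (beta - 1) / beta
    k = math.log(n) / math.log(1 / x)
    work = n * (1 - x ** k) / (1 - x)
    return k, work


k, work = average_case_estimate(n=1024, beta=2)
print(round(k, 6), round(work, 6), work < 2 * 1024)   # 10.0 2046.0 True
```

For $\beta = 2$ and $n = 1024$ this gives $k = \log_2 n = 10$ passes and a total of about $2n - 2$ key visits, which stays below the bound $\beta n = 2n$.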



4 Conclusions 

In-place associative permutation sort technique is proposed, which solves the main
difficulties of distributive sorting algorithms through its three inherent basic steps, namely (i)
practicing, (ii) permutation and (iii) restoring. It is simple and straightforward, and
around 30 lines of C code are enough to implement it.

From the time complexity point of view, associative permutation sort shows similar
characteristics to bucket sort. It sorts the keys associatively in $O(m)$ time for the average
(uniformly distributed keys) and $O(n)$ time for the best case. Although its worst case
time complexity is $O(n + m)$, it is upper bounded by $O(n^2)$ for the lists where $m > n^2$.
On the other hand, it requires only $O(1)$ additional space, making it time-space efficient
compared to bucket sort. The ratio $m/n$ defines its efficiency (time-space trade-off),
allowing very large lists to be sorted in place. Furthermore, the dependency of the efficiency
on the distribution of the keys is $O(n)$, which means it replaces all the methods based
on address calculation, which are known to be very efficient when the keys have a known
(usually uniform) distribution and require additional space more or less proportional to
$n$. Hence, in-place associative permutation sort asymptotically outperforms all content
based sorting algorithms when $m/n < c$, where $c$ is the efficiency constant defined by the other
sorting algorithms, regardless of how large the list is.

The technique seems to be very flexible, efficient and applicable to other problems as
well, such as hashing, searching, succinct data structures, gaining space, etc.

The only drawback of the algorithm is that it is unstable. But, an imaginary subspace 
can create other subspaces and associations using the idle integers that were already prac- 
ticed by manipulating either their position or value or both. Hence, different approaches 
can be developed to solve problems such as stability. 

5 Discussion

Associative permutation sort first finds the minimum of the list and starts with the keys
in $[\min(S'), \min(S') + n - 1]$. However, instead of starting with this interval, omitting the
MSBs, if we consider a word as $w - 1$ bits and the most significant $\lceil \log n \rceil$ bits of a word
as the key and the remaining bits as the satellite information, the problem reduces to
sorting $n$ integer keys $S[0 \ldots n-1]$ each in the range $[0, 2^{\lceil \log n \rceil} - 1]$. Since it is possible
that $2^{\lceil \log n \rceil} - 1 > n - 1$, the keys in $[n, 2^{\lceil \log n \rceil} - 1]$ become the keys that are out of the
practiced interval.
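
The key/satellite split described above can be sketched in Python as follows. The function name `split_word` and the explicit word width `w` are assumptions made for the example, and the input word is assumed to fit in $w - 1$ bits (MSB already omitted).

```python
def split_word(word, n, w):
    """Split a (w-1)-bit word into (key, satellite).

    The key is the most significant ceil(log2(n)) bits; the remaining low
    bits are treated as satellite information.
    """
    key_bits = (n - 1).bit_length()        # equals ceil(log2(n)) for n >= 1
    sat_bits = (w - 1) - key_bits          # bits left for the satellite part
    key = word >> sat_bits
    satellite = word & ((1 << sat_bits) - 1)
    return key, satellite


# With n = 6 and w = 8 the key uses 3 bits, so keys range over [0, 7].
print(split_word(0b1011010, n=6, w=8))   # (5, 10): key 5 is inside [0, n-1]
print(split_word(0b1110001, n=6, w=8))   # (7, 1):  key 7 is outside [0, n-1]
```

The second word shows a key in $[n, 2^{\lceil \log n \rceil} - 1]$, i.e., one that falls out of the practiced interval in the first pass.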

As a result, when the keys are sorted according to their most significant $\lceil \log n \rceil$ bits,
in-place associative most significant radix permutation sort is obtained. After the list is
sorted according to the most significant $\lceil \log n \rceil$ bits, the idle keys are grouped and each
group is preceded by the corresponding node that has practiced them. Hence, each group
can be sorted sequentially or recursively, taking the satellite information as the key. If
the algorithm itself is used, it becomes an algorithm based on a hash-and-conquer paradigm, in contrast
to divide-and-conquer. However, the size of the subgroups decreases, and it may not be efficient
when the ratio of the range of keys in each subgroup to the size of that subgroup, i.e., $m/n$, increases.
Hence, other strategies may need to be developed after the first pass.